Augmented Cognition Methods And Apparatus For Contemporaneous Feedback In Psychomotor Learning

ABSTRACT

A method of creating a scalable dynamic jointed skeleton (DJS) model for enhancing psychomotor learning using augmented cognition methods realized by an artificial intelligence (AI) engine or image processor. The method involves extracting a DJS model from either live motion images or video files of an athlete, teacher, or expert to create a scalable reference model for use in training, whereby the AI engine extracts physical attributes of the subject, including arm length and torso length, and captures successive movements of a motor skill such as swinging a golf club, including position, stance, club position, swing velocity and acceleration, twisting, and more.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 16/743,337, filed Jan. 15, 2020, now U.S. Pat. No. ______, issued ______, 2023, which claims the priority of Provisional Application No. 62/792,875, filed Jan. 15, 2019. Each of the foregoing applications is incorporated herein by reference in its entirety.

BACKGROUND OF INVENTION

Studies show that learning speed, proficiency, and retention are enhanced by “active learning”, where the student actively participates in the instructive process. This principle is particularly true when acquiring new skills involving a coordination of vision and movement, known as psychomotor learning or kinematics. In athletic training or in developing specialized psychomotor skills like learning neurosurgery, rapid feedback is an important element in the learning process. This principle is depicted in the retention pyramid 1 graphic shown in FIG. 1 illustrating various levels of learning and retention [https://en.wikipedia.org/wiki/learning#/media/file:learning_retention_pyramid.jpg] comprising passive learning involving ‘receiving’ 2, active learning involving ‘participating’ 3, and active learning involving ‘doing’ 4.

In passive learning involving receiving 2, the participant learns through lectures, reading, or audiovisual presentations. Retention varies between 20% and 75% but is typically less than 40%. Although studying may help a student better understand what they are doing, it does little to reveal “how” the task should be performed. Receiving is particularly ineffective in teaching psychomotor skills involving eye-hand coordination since the student has no chance “to try it out” themselves.

Participating 3, a low level of active learning, involves a student being tested or encouraged to play and perform exercises after being taught by demonstration or through actively participating in online courses. Retention is better than receiving 2, but rarely exceeds 75%. In psychomotor training for sports or manual tasks, the student is still unable to execute the real motions, thereby impairing learning. Throwing a shot put, for example, cannot be learned by online testing. Until a person picks up a shot put it is difficult to imagine how heavy it is or how to throw it without injuring oneself. Computer-based training for tasks such as learning to fly an airplane may emulate the operation of the cockpit instruments but still lacks the accelerative forces a pilot feels in a real airplane or in a professional flight simulator.

Greater learning retention and faster learning involve the student actively participating in the learning process by doing 4, generally through coaching or a live trainer's instruction followed by timely practicing. In doing-based learning, the student alternates between a coaching instructor showing ‘what to do’ and the student trying to emulate the action, first when the coach is present and again when they are on their own. Coaching or personal trainer instruction through doing 4 generally accelerates a student or athlete's rate of progress while improving retention to over 75% but suffers from high cost, the inconvenience of scheduling appointments and instructor availability, and a lack of feedback during solitary practice times.

The repeated application of a teacher's demonstration followed by a student's attempt to repeat the same motions, i.e. a “seeing-doing . . . sequence”, forms the basis for active psychomotor learning today. Allegedly the repetitive nature of training under a coach inculcates behavior via neural programming of a stimulus response (SR) pattern in the brain. Neural programming facilitating psychomotor learning (popularly referred to as “muscle memory”) occurs in three stages comprising (i) the cognitive phase, (ii) the associative phase, and (iii) the autonomous phase. In the cognitive phase, the educator-coach explains and describes the task, likely with demonstrations, and the student executes their first attempts.

In the associative phase, the student practices the gesture with diminishing degrees of coaching or teacher intervention. The associative phase, which comprises the longest interval in psychomotor learning, carries the risk that the fledgling athlete may practice the action incorrectly, i.e. learn a bad habit that may be difficult to correct unless the problem is detected and corrected quickly. In the autonomous phase, the student has internalized the skill and is able to perform the psychomotor task unconsciously. If, however, the skill was learned incorrectly during the associative phase, it is nearly impossible to undo and correct in the autonomous phase. This fact is reflected in the old adage “you can't teach an old dog new tricks”.

One solution to reduce the risk of a student developing bad psychomotor behavior during the associative phase is to increase the frequency of evaluation and active feedback when it is needed most. While such a premise is theoretically sound, pragmatically speaking it is nearly impossible to facilitate. For example, after training camp, when a rookie football player has just learned the coach's plays and strategic “system” and is most amenable to putting what they have learned to good use, the preseason games begin and the coaching staff has no time to concentrate on a particular player's needs. In essence the rookie is left to practice on his own when he needs guidance most.

Timing is a key factor in learning and retention. Research in education and training has shown that the learning process involves two opposing mechanisms: a ‘learning curve’ and a ‘forgetting curve’. Forgetting starts the moment that instruction stops. As an example, FIG. 2 reveals an exponential decline 10 in retention starting with 58.2% initial retention 11 and declining to 21.1% retention 12 after 31 days [http://www.elearningcouncil.com/learning-theory/overcoming-ebbinghaus-curve-how-soon-we-forget/]. As a simple exponential decay model first proposed by Ebbinghaus, the model characterizes forgetting using a single time constant τ indicating decay to a retention level 13 representing a magnitude 1/e, or roughly 37%, of the original retention 11. In the example shown, this time constant τ is approximately 7 hours, a value that varies with the strength of the initial learning and retention.
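
The single-time-constant decay described above can be sketched numerically. The following is a minimal illustration only, assuming retention follows R(t) = R0·exp(-t/τ); the function name `retention` and the sample times are hypothetical, and the sketch does not attempt to reproduce the data points of FIG. 2.

```python
import math


def retention(t_hours: float, initial: float, tau_hours: float) -> float:
    """Single-time-constant exponential forgetting: R(t) = R0 * exp(-t / tau)."""
    if tau_hours <= 0:
        raise ValueError("tau_hours must be positive")
    return initial * math.exp(-t_hours / tau_hours)


if __name__ == "__main__":
    r0 = 58.2   # initial retention (%) quoted in the text
    tau = 7.0   # approximate time constant (hours) quoted in the text
    for t in (0.0, 7.0, 14.0, 24.0):  # sample times are arbitrary
        print(f"t = {t:4.1f} h  retention = {retention(t, r0, tau):.1f}%")
```

After one time constant (t = τ) the computed retention is 1/e, or roughly 37%, of its initial value, consistent with the retention level 13 described above.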

Retention can be improved by repetition as illustrated in FIG. 3, where initial forgetting curve 16 decays from 100% of the initial retention 15 to a value 17 before a ‘refresher’ recovers the memory 18 [https://twitter.com/davidwees/status/984131508058243072], followed by a new forgetting curve 19, repeated thereafter by forgetting curve 20. The longer the duration between “practice sessions”, the farther down the forgetting curve the student's retention slides, erasing the benefits of previously and laboriously climbing the learning curve. If the training sessions occur infrequently because of instructor availability, the results of active learning education are not significantly better than passive learning.
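
The effect of the interval between refreshers can be illustrated with a small sketch. It assumes, purely for illustration, that each refresher restores retention to 100% and that the same time constant applies after every refresher; the function name `retention_with_refreshers` and the schedules shown are hypothetical and are not taken from FIG. 3.

```python
import math
from typing import Sequence


def retention_with_refreshers(t: float, refresh_times: Sequence[float], tau: float) -> float:
    """Fraction retained at time t when each refresher restores retention to 1.0.

    Assumption for illustration: the time constant tau is unchanged by refreshers.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Decay restarts from the most recent refresher at or before time t
    last_refresh = max((r for r in refresh_times if r <= t), default=0.0)
    return math.exp(-(t - last_refresh) / tau)


if __name__ == "__main__":
    tau = 7.0  # hours, hypothetical
    schedules = {"frequent": [4.0, 8.0, 12.0], "infrequent": [12.0]}
    for label, schedule in schedules.items():
        # Sample every 0.1 h over 16 h and report the lowest retention reached
        lowest = min(retention_with_refreshers(i / 10, schedule, tau) for i in range(160))
        print(f"{label}: lowest retention = {lowest:.2f}")
```

In this sketch the schedule with longer gaps between sessions reaches a lower minimum retention, mirroring the slide down the forgetting curve described above.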

FIG. 4 illustrates various types of learning retention curves and their associated forgetting curves [https://www.researchgate.net/publication/288000042_the_need_for_a_standardized_and_common_way_of_process_training], including learning by classroom training 33 and coaching 30. In the example shown, classroom learning 33 achieved a higher degree of learning shown by retention 34 but experienced a faster decline shown by forgetting curve 32 than learning through coaching 30, which resulted in a peak retention 31 and a slower forgetting curve 35. One common solution used by coaches is to supplement their personal instruction with audio-video images and e-training as part of the learning process. As shown, peak retention 31 may decay to 36 then be boosted to retention 37. The overall learning may decline to retention 38 but remains far above the coaching-only forgetting curve 35.

The simplest form of electronic training is performed by having a student watch videos to supplement their personalized training. While this method may be useful for learning topical and conceptual subject matter such as TED talks, the method is less beneficial in psychomotor learning. Although watching videos of Tom Brady throwing a pass, Tiger Woods teeing off, or Venus Williams serving a tennis ball might be insightful for studying a pro's form or technique, a student cannot properly practice their movements while watching the video because they cannot see themselves. In such cases, the trainee must alternate between watching the video demonstration and practicing what they learned. Without the ability to make contemporaneous comparisons, however, there is a greater risk the student or trainee may learn the psychomotor skill incorrectly.

Through the use of videos, a professional athlete's movements and a trainee's live image can be displayed concurrently as two video images, ostensibly facilitating contemporaneous visualization by superimposing the student's motion image on top of or beside the pro athlete while repeatedly looping the motion. Video image overlays for A-B comparisons can be displayed using commonly available display mediums shown in FIG. 5 such as flat panel TVs and monitors 40, virtual reality (VR) goggles 41, or glasses with projection or mirrors as a heads-up display 42. While the idea sounds appealing, these video-based image overlays do not work well in psychomotor training for numerous reasons, one of the most significant being that the heights of two people are unlikely to match, making it difficult to overlay the images in a meaningful or useful way. The height mismatch problem is illustrated in FIG. 6 where golfer 50 with a height of 170 cm is displayed alongside a second golfer 60 having a height of 201 cm. As shown, the height difference of the two golfers makes instructive comparison difficult as every key body feature differs in its length, including spine lengths 51 versus 61 and waist height 57 versus 67.

Moreover, athletes of different heights require different sized equipment. In order for shorter golfer 50 to maintain the same desired angle θ between golf club 52 and ground 59 as tall golfer 60 achieves with golf club 62, i.e. to match angles 56 and 66, the length 55 of club 52 must be shorter than the length 65 of club 62. Other differences between club 62 and 52 likely include the length of the grip, the shape of the head, and the club's construction.
As such, the two images cannot be simply superimposed but must be scaled in dimension.

Simple linear scaling to match a reference video to a live camera view can also be problematic. In FIG. 7 for example, video 70 of a pro golfer 71 taken from a distance, possibly in a golf tournament, must be scaled to a larger dimension to facilitate an A-B comparison to a live golfer. If however the image resolution of the original video is insufficient, a common case for older videos, scaling the image of pro golfer 71 to a larger size 72 invariably will cause image ‘pixelation’, resulting in a blurry image useless for study. Aside from resolution issues, the angle of the original reference video source may be a problem. As depicted in FIG. 8, the line of sight 77 of video camera 76 is only able to capture an image of the backside of golfer 75. In such cases, a trainee is unable to see what golfer 75 is doing prior to his swing, rendering an A-B video comparison useless.

Even in cases where a reference video of a pro golfer employs a relevant camera angle and is sufficient in resolution to scale without pixelation, video image overlays can be problematic. Golf swing training [http://oukas.info/?u=ben+hogans+golf+swing+secret] relying on A-B image comparisons can be aggravated by dissimilar body proportions even for golfers of comparable heights. For example, FIG. 9 illustrates images of two golfers of similar heights, golfer trainee 80 and golfer reference 85, but with different body proportions. Trainee 80 has a slightly longer torso 81, shorter leg length 83, and a lower waist 82 while reference 85 has a shorter torso 86, a longer leg length 87, and a higher waist 88.
Because of the differing proportions, a meaningful visual overlay for active learning is difficult to achieve even if the height of reference 85 is linearly scaled to perfectly match trainee 80. Linear scaling means every dimension is scaled in the same ratio from its original. A 10% linear shrink, for example, means the image height is scaled to 90% of its original, as are the lengths of the torso, arms, legs, etc.

Linear image scaling assumes the lengths of a human's body and appendages are proportionate to their height. FIG. 10, however, illustrates that this simplifying assumption is not correct. A sample of body silhouettes ranked monotonically from silhouette 90a to 90f in order of diminishing height reveals a diverse range of heights present across the human species, exemplified by lines 91a, 91e and 91f. For example, height 91e corresponding to the top of the head of silhouette 90e intersects the neck portions of silhouettes 90a and 90b while intersecting the cranial portions of silhouettes 90c and 90d. Height 91f corresponding to the top of the head of shortest silhouette 90f barely covers the shoulders of silhouette 90d and only reaches the torso of silhouettes 90b, 90c, and 90d. As such there is no easy way to compare the motion of a golfer having the body build of silhouette 90a to that of silhouette 90f. Moreover, the lengths of a person's thorax (trunk), arms, and legs do not necessarily scale proportionately with their height.

One important attribute in golfing is rotation of the spine during a golf swing.
While the upper twelve vertebrae of the spine, known as the thoracic spine, can twist as much as 40°, the spine's lower portion, the lumbar attached to the top of the pelvis and sacrum corresponding to the bottom S1 to S5 vertebrae of the spine, can only rotate around 12° without the risk of over-extension [https://experiencelife.com/article/expert-answers-on-spine-rotation/]. Since this inflexible portion of the spine limits the spine's ability to twist, the height of the lumbar above the ground is an important factor in determining and limiting an individual's golf swing. This consideration means it is ill advised to train a short golfer using a tall golfer as a reference. As an indirect measure of a person's leg and hip length, the height of the lumbar does not necessarily vary in proportion to an athlete's height. For example, lumbar height 92b of silhouette 90b is quite similar to that of silhouette 90a despite their corresponding differences in height. Similarly the lumbar height 92e of silhouette 90e is similar to those of silhouettes 90c and 90d despite their height differences. The lumbar height 92f of silhouette 90f is shorter than the leg length of most of the other silhouettes shown.

Another consideration is the distance between a golfer's hands and the ground. This height affects their swing and also influences their selection of a suitable golf club. For example, height 94a of hands 93a corresponds to the hips of silhouette 90a, height 94e of hands 93e is above the hips of silhouette 90e, while height 94f of hands 93f is below the hips of silhouette 90f.
Accordingly there can be no predefined assumption about the length of a person's arms, legs, thorax, or height, meaning linear image scaling cannot be used to achieve contemporaneous visualization superimposing a live image atop a reference image.

The complex motions required of athletes in a tennis serve, pitching a baseball, or hitting a golf ball further exacerbate the problem of scaling video image overlays. The golf drive shown in images 98a through 98g of FIG. 11 illustrates the complex motion involved, comprising aligning the club 98a; beginning a backswing 98b; fully extending the club behind the head 98c; beginning the downswing 98d; approaching contact at the bottom of the downswing 98e; striking the ball 98f; and comfortably completing the swing carried by the club's momentum 98g. Aside from the aforementioned scaling issues, this complex motion is extremely difficult to mimic, especially synchronized to a video loop.

FIG. 12 depicts the motion of the described golf swing of golfer 100 graphically in sequence [http://lh3.ggpht.com/_hk_qorjidk/sv7u21yrkyi/aaaaaaaaahu/bfll7hxsmd0/s1600-h/image%5b7%5d.png] by tracking the rotational speeds of the golfer's pelvis 101a, thorax 102a, arm 103a, and club 104a (shown in °/s) with corresponding curves 101b, 102b, 103b, and 104b. In the graph, a backswing motion is represented as a negative number and a positive number represents the downswing. A zero rotational speed at time 105 corresponds to maximum height extension of the backswing just prior to commencing the downswing, a position shown by position 98c in FIG. 11. The analysis shows acceleration in the backswing followed by a brief moment to change direction at time 105 and then a rapid forward acceleration.

The challenge of a golfer trying to match a pro golfer's swing in a reference video is complex, especially when the video endlessly loops. This issue is illustrated by the graphic of FIG. 13 comparing an early swing and a late swing to the reference swing the trainee wishes to emulate. In the early swing the trainee starts the swing before the video golfer commences, reaching the full extent of the backswing shown at time 106a well before the reference video golfer reaches the same point in the swing at time 105. Unless the trainee interrupts his swing at time 106a, an action potentially causing injury, the trainee's downswing will commence ahead of reference downswing 104b, completing well ahead of the video and rendering the comparison useless. In the late swing example shown, the trainee performs their backswing relatively synchronized to the reference swing but then takes longer to commence the downswing at time 107a, not reaching peak positive acceleration 107b until well after reference swing 107b. Since the swing ends late, chances are the golfer will miss the next practice swing in the video loop too because the video will be starting the next loop while the trainee is still completing the last swing. In short there is no way to naturally synchronize the trainee to the static video loop.

Another set of problems in visual overlays for psychomotor training occurs in the means by which to overlay the video content and the real image. As shown in FIG. 14, raw video images of the trainee 110a and of the reference video 110b are displayed side-by-side on the screen or in the VR goggle.
Since the color, size, location of trees, and angle of the golfer's stance in the reference video all differ from the live video stream, there is really no way for the trainee to naturally perceive the images or compare the actions. In essence, the method is like watching two different but similar movies at the same time. The only result of this A-B video image overlay visualization method will be the golfer getting a headache. The problem with such an approach is that both images show information the golfer does not care about (like trees) that clutters the images, obscuring clear visuals and the important information the trainee wants to see, i.e. comparative body movements.

Using optical processing to highlight the golfers relative to their backgrounds does not really help either. For example, in an attempt to improve image contrast, processed image 111a comprises the content of raw image 110a after performing a mathematical operation for edge detection. Although the process does recognize portions of the golfer, parts of the neck 113a merge into the image of the golf course trees, indicator lines used to show club angles such as line 112a are also lost in the background image, and portions of the leg 114a are lost altogether (a dashed line was added to clarify the portion lacking any edge detection). The same problems occur for the edge detection image 111b processed from the raw reference video 110b, except that the result may be even worse. As in the live video, portions of the golfer's back 113b and indicator lines 112b merge into the background landscape, portions of the leg 114a are missing, and even part of the golf club 112c disappears.

Using a ghost video image overlay, i.e.
a transparent copy of the two golfers 121a and 121b superimposed on a merged image background as shown in image 120 of FIG. 15, only makes matters worse in that (i) it becomes more difficult to see the golfers, and (ii) the backgrounds merge together, further reducing image contrast. Applying edge detection to create processed image 122 fails altogether, unable to identify portions of a golfer's back 123a, head 124b, legs 125a, arms 126b, and even the clubs 127a and 127b. As such edge detection is useless in processing ghost video image overlays.

In summary, A-B image overlays are not useful for psychomotor learning because of a variety of uncorrectable conditions including the following:

-   The student's body size does not match the size of the expert, e.g. one is taller, bigger, etc.;
-   Available videos of the expert may be limited to poor resolution and taken at great distances, i.e. comprising a small image size that cannot be scaled to a larger size;
-   Available videos of the expert may be taken at an angle incapable of producing face-forward motion images usable for training;
-   The expert's physical features are disproportionate compared to the student so that even if the height is scaled, the legs, arms, or torso do not match the student, i.e. images do not scale proportionally;
-   Visual overlays using videos lack the ability to synchronize the motion images of the expert and student, uncomfortably forcing the student to mimic the video in speed and repetition rate;
-   Videos lack metrics of the expert's motion and of the student's attempts, and offer no means to quantitatively compare the two.

The impact of these aforementioned limitations is that video-based visual overlays have limited utility in psychomotor learning. The use of heads-up displays and augmented reality glasses adapted from video games does not correct the fact that the source material, i.e. videos of professionals and experts, is unsuitable in both form and function for psychomotor learning.
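
The limitation of linear scaling noted above may be illustrated with a short sketch. The segment names and lengths below are hypothetical values chosen only for illustration and are not measurements from FIG. 9 or FIG. 10; the helper names `linear_scale` and `mismatch` are likewise assumptions.

```python
from typing import Dict


def linear_scale(segments: Dict[str, float], factor: float) -> Dict[str, float]:
    """Scale every segment by the same ratio, as linear image scaling does."""
    return {name: length * factor for name, length in segments.items()}


def mismatch(scaled: Dict[str, float], target: Dict[str, float]) -> Dict[str, float]:
    """Per-segment difference remaining after scaling (scaled minus target)."""
    return {name: scaled[name] - target[name] for name in target}


# Hypothetical segment lengths in cm (not taken from the figures)
reference = {"head_neck": 30.0, "torso": 55.0, "legs": 95.0}   # total 180 cm
trainee = {"head_neck": 30.0, "torso": 60.0, "legs": 80.0}     # total 170 cm

# Match overall height exactly with a single linear factor
factor = sum(trainee.values()) / sum(reference.values())
scaled_reference = linear_scale(reference, factor)
residual = mismatch(scaled_reference, trainee)

for name, delta in residual.items():
    print(f"{name}: {delta:+.1f} cm after height-matched linear scaling")
```

Even though the overall heights match after scaling, the individual segments in this sketch remain several centimeters apart, which is the proportion problem described above.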

Other attempts to gather information about a student's psychomotor training progress have involved the use of sensors embedded in a body suit or vest. The sensors may be used for force (acceleration) measurements or for simple motion tracking. Motion tracking, an innovation adapted from filmmaking, has been used extensively to more realistically animate the movement of non-human characters such as Gollum in Lord of the Rings. While motion tracking suits are amenable to filmmaking within the controlled environment of a closed movie set, adapting such methods to outdoor sports and competitive athletics remains problematic for many reasons including cost, complexity, durability (breakage), and inconvenience. Moreover, body suits limit the range of motion of an athlete, affecting a golfer's swing, a pitcher's throw, a running back's mobility, etc. Despite the hype, wearable electronics have not been commercially successful. Not only are wearable electronics difficult to clean and maintain hygienically, the field failure rate of the devices is orders of magnitude higher than that of rigid enclosure electronics.

Another method to obtain active feedback data for psychomotor training involves installing sensors into sports apparatus, for example adding a radio transmitter and sensor into golf balls, baseballs, footballs, or soccer balls; placing them on golf clubs, baseball bats, cricket bats, or hockey sticks; or installing them into ice skates, skis, sails, etc. This strategy also raises many issues. For example, adding a sensor into a ball can change the trajectory of the ball. Training using sensor balls while playing games with non-sensor balls may adversely impact an athlete's performance in real competition. It is also unclear if any sports teams will embrace the idea of allowing the use of sensors in balls or clubs. Even if the teams like the idea, it remains unclear if the use of sensors violates league regulations. The use of active sensors in school and professional sports competition also opens the possibility of data hacking. In such scenarios, hacked sensor data could be used to challenge an official's ruling, cast doubt on the winner of a game, or lead to litigation. Hacked sensor data could also be used to influence gambling in legal and illicit sportsbooks.

In summary, there is no technology available today well suited for improving psychomotor training and learning. What is needed is a convenient yet effective means to provide active feedback to students or athletes striving to learn new skills or autonomously improve their performance using psychomotor learning, without the need for a personal coach, trainer, or instructor and without requiring the use of any complex, expensive, or invasive apparatus or sensors.

SUMMARY OF THE INVENTION

In summary, a novel method of creating a scalable dynamic jointed skeleton (DJS) model for enhancing psychomotor learning using augmented cognition methods realized by an artificial intelligence (AI) engine or image processor is described. The method involves extracting a DJS model from either live motion images or video files of an athlete, teacher, or expert to create a scalable reference model for use in training, whereby the AI engine extracts physical attributes of the subject, including arm length and torso length, and captures successive movements of a motor skill such as swinging a golf club, including position, stance, club position, swing velocity and acceleration, twisting, and more. During practice, a live image of a student or athlete in training is captured by a video camera or smartphone, processed by an artificial intelligence (AI) engine, and analyzed to extract the model parameters required to scale the DJS model to match the size and shape, i.e. the physical attributes, of the live athlete's image.

The DJS model is then scaled in size and played dynamically like a movie, frame by frame, overlaid upon the real-time video image of the live athlete. The composite image of the DJS model playback and the live camera image of the athlete is displayed on a heads-up display, standing screen, augmented reality glasses, or other visual device connected via wired or wireless link, allowing the student to observe their motions in comparison to the reference model's actions, accelerating learning by providing real-time visual feedback, especially during practice comprising repeated attempts. For enhanced real-time performance, the live image of the athlete transmitted for display may comprise a low-resolution image or even a glow or silhouette while the DJS model's skeleton may comprise a white or contrasting color image. Unfiltered high-resolution camera data may be retained and uploaded to a database later for more detailed analysis.
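
One way the scaling step summarized above might be approximated is to rescale each segment of a jointed skeleton independently while preserving its direction. The sketch below is an illustration under stated assumptions, not the disclosed implementation: the `Bone` structure, the two-dimensional coordinates, the joint names, and the target segment lengths are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Vec = Tuple[float, float]


@dataclass(frozen=True)
class Bone:
    """One joint of a hypothetical 2D jointed-skeleton frame."""
    parent: Optional[str]  # name of the parent joint, or None for the root
    offset: Vec            # position relative to the parent (absolute for the root)


def scale_frame(frame: Dict[str, Bone], target_lengths: Dict[str, float]) -> Dict[str, Bone]:
    """Rescale each bone to a target segment length, keeping its direction."""
    scaled = {}
    for name, bone in frame.items():
        length = math.hypot(bone.offset[0], bone.offset[1])
        if bone.parent is None or length == 0.0 or name not in target_lengths:
            scaled[name] = bone  # root and unmeasured segments are left unchanged
            continue
        k = target_lengths[name] / length
        scaled[name] = Bone(bone.parent, (bone.offset[0] * k, bone.offset[1] * k))
    return scaled


def joint_positions(frame: Dict[str, Bone]) -> Dict[str, Vec]:
    """Accumulate offsets from the root outward to get absolute joint positions."""
    positions: Dict[str, Vec] = {}

    def resolve(name: str) -> Vec:
        if name not in positions:
            bone = frame[name]
            if bone.parent is None:
                positions[name] = bone.offset
            else:
                px, py = resolve(bone.parent)
                positions[name] = (px + bone.offset[0], py + bone.offset[1])
        return positions[name]

    for name in frame:
        resolve(name)
    return positions


# Hypothetical single reference frame (cm): pelvis -> neck -> hand
reference_frame = {
    "pelvis": Bone(None, (0.0, 90.0)),
    "neck": Bone("pelvis", (0.0, 55.0)),
    "hand": Bone("neck", (30.0, -40.0)),
}
student_lengths = {"neck": 60.0, "hand": 45.0}  # hypothetical measured lengths

scaled_frame = scale_frame(reference_frame, student_lengths)
print(joint_positions(scaled_frame))
```

Because each segment is scaled separately, joint angles in the frame are preserved while segment lengths follow the student rather than a single height ratio. Applying the same function to every frame of a sequence would yield a scaled playback under the same assumptions.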

In repeated loop training, the DJS model can be looped repeatedly, with each playback cycle triggered by the athlete commencing action, e.g. starting their backswing. In another embodiment used for synchronization to discontinuous actions, the DJS model playback can be triggered by more than one event, e.g. commencing a first playback sequence corresponding to the beginning of a golfer's backswing, stopping playback at the top of the swing for an indeterminate period, then triggering a second DJS playback sequence of the drive commencing at the beginning of the downswing.
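
The two-trigger playback described above can be sketched as a small state machine. This is a minimal sketch under assumptions: how the backswing and downswing events are detected is not addressed here, and the class name `TriggeredPlayback`, its states, and the frame counts are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_BACKSWING = auto()   # model idle at frame 0 until the athlete starts
    PLAY_BACKSWING = auto()   # first playback sequence
    HOLD_AT_TOP = auto()      # paused for an indeterminate period
    PLAY_DOWNSWING = auto()   # second playback sequence


class TriggeredPlayback:
    """Advance model frames only when the athlete's own actions trigger them."""

    def __init__(self, backswing_frames: int, downswing_frames: int) -> None:
        if backswing_frames < 1 or downswing_frames < 1:
            raise ValueError("each sequence needs at least one frame")
        self.top = backswing_frames - 1                       # frame at top of swing
        self.last = backswing_frames + downswing_frames - 1   # final frame of swing
        self.state = State.WAIT_BACKSWING
        self.frame = 0

    def step(self, backswing_started: bool = False, downswing_started: bool = False) -> int:
        """Process one display tick and return the model frame index to show."""
        if self.state is State.WAIT_BACKSWING:
            self.frame = 0
            if backswing_started:
                self.state = State.PLAY_BACKSWING
        elif self.state is State.PLAY_BACKSWING:
            self.frame += 1
            if self.frame >= self.top:
                self.frame = self.top
                self.state = State.HOLD_AT_TOP
        elif self.state is State.HOLD_AT_TOP:
            if downswing_started:
                self.state = State.PLAY_DOWNSWING
        elif self.state is State.PLAY_DOWNSWING:
            self.frame += 1
            if self.frame >= self.last:
                self.frame = self.last
                self.state = State.WAIT_BACKSWING  # ready for the next loop
        return self.frame
```

In use, a caller would invoke `step()` once per displayed frame and pass the trigger flags whenever the corresponding event is detected; the model then waits at the top of the swing for as long as the athlete does, rather than running on a fixed video loop.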

Using methods of enhanced cognition, i.e. where the AI engine learns about the student while the student is learning from the psychomotor training system, the AI engine can initially adapt the playback to more closely match the student's movements, then gradually shift toward the reference DJS model's swing playback to more accurately reflect the expert's stroke. In addition to dynamic adjustment of the swing, the DJS model can also be rotated in a range of 360° to closely match the camera angle of the live image source. In another embodiment, the live image may comprise video images from multiple cameras. Other information may also be collected from sources other than a video camera, including a ball launch monitor using LIDAR or ultrasound, or from sensors detecting ball position, club velocity, and tilt (torque). This information can be used to improve the instructive value of the DJS model playback and to compensate for systematic errors such as hitting the ball off angle, e.g. slicing the ball.
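
The gradual shift from the student's movement toward the reference, and the rotation of the model to match a camera angle, might be sketched as follows. Both functions are assumptions for illustration: linear interpolation of joint coordinates is only one possible blending rule, the vertical axis is taken to be y, and the names `blend_pose`, `adaptation_weight`, and `rotate_about_vertical` are hypothetical.

```python
import math
from typing import Dict, Tuple

Point3 = Tuple[float, float, float]
Pose = Dict[str, Point3]


def adaptation_weight(attempt: int, ramp_attempts: int) -> float:
    """Hypothetical schedule: 0.0 on the first attempt, 1.0 after ramp_attempts."""
    if ramp_attempts < 1:
        raise ValueError("ramp_attempts must be at least 1")
    return min(max(attempt / ramp_attempts, 0.0), 1.0)


def blend_pose(student: Pose, reference: Pose, weight: float) -> Pose:
    """Interpolate joints: weight 0.0 gives the student pose, 1.0 the reference."""
    w = min(max(weight, 0.0), 1.0)
    return {
        joint: tuple((1.0 - w) * s + w * r for s, r in zip(student[joint], reference[joint]))
        for joint in reference
    }


def rotate_about_vertical(pose: Pose, degrees: float) -> Pose:
    """Rotate every joint about the vertical (y) axis by the given angle."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return {joint: (c * x + s * z, y, -s * x + c * z) for joint, (x, y, z) in pose.items()}


if __name__ == "__main__":
    student_pose = {"hand": (0.0, 1.0, 0.0)}     # hypothetical coordinates
    reference_pose = {"hand": (2.0, 1.0, 0.0)}
    for attempt in (0, 5, 10):
        w = adaptation_weight(attempt, ramp_attempts=10)
        print(attempt, blend_pose(student_pose, reference_pose, w))
    print(rotate_about_vertical(reference_pose, 90.0))
```

Both poses are assumed to share the same joint names and coordinate frame; in practice the rotation angle would be chosen to match the live camera's viewpoint.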

In another embodiment, swing dynamics and/or ball trajectory can be used to assess an athlete's performance, to project where a ball will land on a course, or to score the athlete against an expert or against other golfers in a game or tournament. The performance criteria may also be used for gamification or gambling applications or for earning credits or tokens redeemable at participating golf courses or player leagues.

The same method of augmented cognition for psychomotor learning can be applied to other sports such as tennis or baseball, or used in training students in technical motor skills such as surgery, chiropractic methods, product assembly, or even engine repair.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 : Diagram of learning retention pyramid.

FIG. 2 : Graph of Ebbinghaus forgetting curve.

FIG. 3 : Graph of Ebbinghaus forgetting curve with repeated reviews.

FIG. 4 : Graph of various learning and forgetting curves.

FIG. 5 : Examples of visual display devices.

FIG. 6 : Comparison of different height golfers.

FIG. 7 : Pixelation resulting from image scaling.

FIG. 8 : Poor angle reference image capture.

FIG. 9 : Comparison of two similar sized golfers.

FIG. 10 : Comparison of various body types in height and proportion.

FIG. 11 : Illustration of a stepwise golf swing.

FIG. 12 : Graph of rotational speed of a golf swing.

FIG. 13 : Graph of rotational speeds of various golf swings.

FIG. 14 : A-B side-by-side video image overlays comprising raw images and edge detection.

FIG. 15 : A-B ghost image video overlay comprising raw images and edge detection.

FIG. 16 : Diagram of modified learning retention pyramid including augmented cognition.

FIG. 17 : Flow chart of image capture sequence and model extraction.

FIG. 18 : Filtering used to remove superfluous content from video images.

FIG. 19 : DJS model extraction of a human.

FIG. 20 : Alternative DJS models useful for human motion modeling.

FIG. 21 : Demonstration of a simple jointed skeleton model emulating walking.

FIG. 22A: Conversion of a video capture of a golf swing into a dynamic jointed skeleton sequence.

FIG. 22B: Graph of golf club velocity for a smooth continuous-motion swing.

FIG. 22C: Graph of golf club velocity for a discontinuous-motion swing with a variable delay between backswing and downswing.

FIG. 23 : DJS model parameter extraction to generate compact dynamic model.

FIG. 24 : Curve fitting of various mathematical models to measured data.

FIG. 25 : File generation comprising reference video with extracted DJS model, course map, and file data.

FIG. 26 : AI based model visualization UX including rotation and frame interpolation.

FIG. 27 : Rotation of jointed skeletal model.

FIG. 28 : Simulation of DJS model including terrain data.

FIG. 29 : DJS model derived trajectory animation.

FIG. 30 : Block diagram of image DJS overlay and augmented cognition system and process.

FIG. 31 : Block diagram of data acquisition and scoring combining live video and launch sensor data.

FIG. 32 : Ball drop position and scoring based on DJS model derived trajectory simulation.

FIG. 33 : Combined video and ultrasonic launch sensor data capture.

FIG. 34 : Torque data extraction from video and ultrasonic launch sensor data.

FIG. 35 : MEMS sensor position, velocity, and torque data extraction.

FIG. 36 : Golf ball drive position and velocity MEMS sensor data.

FIG. 37A: Ball sensor data acquisition.

FIG. 37B: Club sensor data acquisition.

FIG. 38A: Graphical components of realtime DJS overlays (front view).

FIG. 38B: Graphical components of realtime DJS overlays (rear view).

FIG. 38C: Scaling graphical components of DJS model to different sizes using proportional and graphical edge methods.

FIG. 39 : Image capture and DJS screen display on golf course.

FIG. 40 : AI-based separation of high-resolution video into DJS model, low-resolution video, and silhouette video.

FIG. 41 : Overlay comparison of silhouette video to expert DJS model.

FIG. 42 : DJS overlay image projected onto heads-up display glasses.

GLOSSARY

AR Glasses: Wearable Augmented Reality (AR) devices that are worn like regular glasses and merge virtual information with physical information in a user's view field. AR Glasses, also known as smart glasses, are usually worn like traditional glasses or are mounted on regular glasses.

Artificial Intelligence (AI): A branch of computer science dealing with the simulation of intelligent behavior in computers, or alternatively the capability of a machine to imitate intelligent human behavior. The Turing Test is one measurement of the successful realization of AI.

Augmented Cognition: A form of human-systems interaction in which a tight coupling between user and computer is achieved via physiological and neurophysiological sensing of a user's cognitive state or through audio-visual sensing and feedback.

Augmented Reality (AR): A technology that superimposes a computer-generated image on a user's view of the real world, thus providing a composite view.

Cognition: The mental action or process of acquiring knowledge and understanding through thought, experience, and the senses. Cognition may be achieved biologically in the brain or may be emulated through Artificial Intelligence.

Contemporaneous Feedback: Information feedback comprising electrical, visual, auditory, or other sensory mechanisms occurring in real time, i.e. with minimum delay, from the action or event being monitored or measured.

Heads-up display (HUD): A transparent or miniaturized display technology that does not require users to shift their gaze from where they are naturally looking. A HUD should not obstruct the user's view. Some, but not all, AR Glasses may be considered as HUDs.

Image Capture: The process of obtaining a digital image from a vision sensor, such as a camera or a camera phone. Usually this entails a hardware interface known as a frame grabber capturing a succession of video frames, converting the image's analog values (gray scale) to digital, and transferring the files into computer memory or transmitting them across a communication network. The conversion process is often accompanied by image compression.

Image Overlay: A type of process or technology combining multiple images into a common graphical representation displayed on a video screen, or via VR headset or AR glasses. A dynamic image overlay performs image overlay on a frame-by-frame basis for rapid or real time playback.

Kinesthesia: In biology, a sense mediated by receptors located in muscles, tendons, and joints and stimulated by bodily movements and tensions, or in robotics the application of sensory data to control the movement of mechanical appendages or prostheses. Also known as kinesthesis or kinesthetic control. Kinesthesia based learning is also referred to as Psychomotor Learning.

Latency: In computer and communication networks, the amount of time delay before a transfer of data begins following an instruction for its transfer. In discontinuous or sporadic processes, Latency may be considered as start-up delay.

Learning: The acquisition of knowledge or skills through experience, study, teaching, training, and practice.

Machine Learning: The application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.

Model Parameters: Variables used to match a mathematical model to measured data and to predict behavior, stimulus-response patterns, and Kinesthesia.

Muscle Memory: The learning and repeated reinforcement of psychomotor skills where an athlete or student is able to consistently repeat a movement or skill without being consciously aware of their learned actions. Through repetitious practice and psychomotor learning, gymnasts, drummers, golfers, and baseball pitchers and batters exhibit muscle memory.

Propagation Delay: In computer and communication networks, the amount of time it takes for a signal to travel from its source or sender to a receiver or recipient. It can be computed as the ratio between the link length and the propagation speed over the specific medium. Propagation Delay may be considered as transport time for a data packet across a communication link or through a network and does not generally include Latency.

Psychomotor Control: The closed loop control of muscles and movement where afferent nerves detect skeletomuscular movement, position, or force and, via nerve transduction through the peripheral nervous system (PNS) and central nervous system (CNS), inform the brain of muscle action, and where the brain cognitively responds to the stimulus by sending instructions to corresponding efferent nerves on the same muscle tissue to adjust movement.

Psychomotor Learning: The process of learning involved in developing motor skills such as movement, coordination, manipulation, dexterity, grace, strength, and speed used in athletic activity, or needed in the operation of tools or instruments.

Turing Test: A method of inquiry in artificial intelligence (AI) for determining whether or not a computer is capable of thinking like a human being. Turing proposed that a computer is said to possess Artificial Intelligence if it can mimic human responses under specific conditions.

Virtual Reality (VR): The computer-generated simulation of a three-dimensional image or environment that can be interacted with in a seemingly real or physical way by a person using special electronic equipment, such as a helmet with a screen inside or gloves fitted with sensors. Also referred to as “artificial reality”.

VR Headset: A head-worn apparatus that completely covers the eyes for an immersive 3D experience. VR headsets are also referred to as virtual reality glasses or goggles.

DESCRIPTION OF THE INVENTION

Given the innumerable problems in producing video image overlays that match the size and proportions of a student or trainee to a reference or expert's movement or timing, the use of video images, scaled or unscaled, is not practical or useful for psychomotor learning. Moreover, such video content lacks the contrast or camera angle for a trainee to clearly observe the movements of the instructor, reference, or coach. Recorded videos, in fact, contain superfluous images such as trees, landscapes, crowds, weather and other artifacts that only obscure the important content and impede the use of image enhancement technology.

As described herein, we propose an inventive method and apparatus to achieve contemporaneous feedback for psychomotor learning through the application of dynamic jointed skeleton (DJS) motion modeling and mirroring enhanced by AI-hosted augmented cognition technology, methods used to adapt the training procedures to the user's learning. Referring to FIG. 16 , as shown by the modified retention pyramid 190, the enablement of concurrent psychomotor learning accelerates the acquisition of new skills while improving learning retention. Enabled by artificial intelligence and machine learning to improve training process efficiencies, contemporaneous feedback for psychomotor learning offers the potential of improving retention beyond 85% to 90%, as illustrated by the topmost pyramid piece 191. In this sense, as the student learns from the system, the system adapts, learning the student's behaviors and adapting its training process thereto. For example, if a golfer spends a longer duration at the top of their swing, i.e. with the club above their head, than other golfers, the system will recognize this behavior and not commence the downswing demonstration so quickly. In this manner the golfer doesn't feel rushed or uncomfortable with the instructive training images needed for psychomotor learning.
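
One simple way such adaptation to a golfer's hold time could be sketched is with an exponential moving average of the hold durations observed over successive swings. This is an illustrative assumption, not the disclosed learning method; the class name `HoldDelayEstimator` and the `smoothing` parameter are hypothetical.

```python
class HoldDelayEstimator:
    """Tracks how long a golfer typically pauses at the top of the swing.

    Hypothetical helper: uses an exponential moving average, which is an
    assumed stand-in for whatever learning method the AI engine applies.
    """

    def __init__(self, initial_delay_s, smoothing=0.3):
        self.delay_s = initial_delay_s  # current estimate, in seconds
        self.smoothing = smoothing      # 0..1, higher adapts faster

    def observe(self, measured_delay_s):
        """Move the estimate toward the newly measured hold duration."""
        self.delay_s += self.smoothing * (measured_delay_s - self.delay_s)
        return self.delay_s


# Start from a 1 s default and observe a golfer who holds for 5 s.
estimator = HoldDelayEstimator(initial_delay_s=1.0, smoothing=0.5)
first = estimator.observe(5.0)
second = estimator.observe(5.0)
```

With each observed swing the estimate moves part of the way toward the golfer's own hold time, so the downswing demonstration is postponed gradually rather than abruptly.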

The method of contemporaneous feedback for psychomotor learning through augmented cognition involves two fundamental steps. In the first step, referred to herein as “image capture and model extraction”, reference content, generally a video of an expert or coach, is converted into a behavioral model and stored in a model library for later or possibly contemporaneous use. During this process, a reference video of an expert or coach is converted by an artificial intelligence (AI) engine into a dynamic jointed skeleton (DJS) model, a physical and behavioral model capable of producing a sequence of images that describe the essential elements of the instructor's actions and motions. In the absence of sufficient information, the AI engine extracts a model to the best of its ability given the quality of its input, generally video content. With access to a library of prior model extractions, the AI engine adapts its model extraction algorithms using machine learning (ML) to improve the efficiency and accuracy of the model over time.

The addition of physics-based models and equipment specification libraries further improves the intelligence of the AI used in the extraction process. The resulting model represents a kinesthetic description of an expert or instructor's actions scalable to match the size and proportions of a student. In one such model, described here as a “dynamic jointed skeleton” or DJS, the model parameters are converted to graph elements comprising edges of varying length and vertices that define the allowed motions of one edge relative to another. The model parameters comprise numeric variables used to match the dynamic jointed skeleton's mathematical model to measured data. Once calibrated to maximize model accuracy, the DJS model can be used to visually depict complex movement and to predict kinesthetic behavior and stimulus-response patterns. While the disclosure relies on the use of a DJS model, the disclosed method may be adapted to other forms of dynamic motion models, for example holographic 3D models, as they become available.

In the second step, referred to here as “image DJS overlay and augmented cognition”, a live or processed image of an athlete is displayed in a visualization device superimposed with an interactive image of the dynamic jointed skeleton or other applicable image renderings (such as a hologram), whereby the trainee can mimic the actions of the reference model, the dynamic model scaled to the exact proportions of the trainee's body. The dynamic model includes event triggers and employs synchronization methods, adapting the model's movement to synchronize to the trainee's actions, incrementally adjusting the model to the expert's actions until the trainee and the model are both executing the same actions in accordance with the trainer or expert's actions used to create the reference model. Since the image DJS overlay is dynamic, i.e. involving movement of both the reference DJS model and the trainee, the AI visualization system adapts its instruction methods to better instruct the trainee in a step-by-step process.
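
The incremental adjustment from the trainee's own motion toward the expert's can be illustrated with a weighted blend of a single model parameter. The sketch below is only an assumption about how such a schedule might look; the function names `blend_weight` and `blended_parameter` and the linear schedule over training sessions are hypothetical.

```python
def blend_weight(session, total_sessions):
    """Weight placed on the expert model for a given training session.

    Assumed linear schedule: 0.0 in the first session (pure trainee
    timing) rising to 1.0 in the last session (pure expert timing).
    """
    if total_sessions <= 1:
        return 1.0
    return min(1.0, max(0.0, session / (total_sessions - 1)))


def blended_parameter(trainee_value, expert_value, weight):
    """Interpolate one model parameter between trainee and expert."""
    return (1.0 - weight) * trainee_value + weight * expert_value


# Hypothetical backswing durations in seconds: trainee 2.0, expert 1.0.
durations = [
    blended_parameter(2.0, 1.0, blend_weight(s, 5)) for s in range(5)
]
```

Here the playback duration shortens step by step across five sessions, ending at the expert's value.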

Machine learning of an AI system based on the bidirectional data flow of an AI-based instructor teaching a student and the student's actions affecting the way the AI instructor provides instruction is referred to herein as “augmented cognition.” Moreover, since the image DJS overlay occurs in real time, i.e. comprising “contemporaneous feedback” to the trainee, the learning curve is accelerated and the subsequent forgetting curve depth and duration are minimized even in the extended absence of a live coach. Using the disclosed methods adapting augmented cognition to contemporaneous feedback through visual based image DJS overlays, psychomotor learning is thereby accelerated. Other forms of feedback to the trainee may also be employed including tactile, haptic, audible or other methods.

In its advanced form, evolution of the AI engine may ultimately lead to the synthesis of an AI reference model that outperforms the experts in the field used to educate the ML system during AI pattern imprinting. Later, these same behavioral models may be used to direct the actions of robots, for example, leading to a robotic golfer or tennis player with expert skills.

Image Capture Sequence and Model Extraction: The first step in the disclosed method and apparatus for augmented cognition for psychomotor learning involves the extraction of a behavioral model. As shown in the flow chart of FIG. 17 , the process starts with a video of an expert, trainer, or coach as a reference video 200. The video content may be prerecorded or comprise a live video stream from a camera recording a demonstration, teaching session, instruction, or from a live competitive event. A digital filter process 201 then optionally modifies the image to improve contrast and enhance the images on a frame-by-frame basis by removing or diminishing the presence of extraneous background features. For example, digital filtering shown in FIG. 18 is able to completely remove background content 208 from an image of golfer 207 leaving a blank background 209.

The output of this filtering process 201 is then fed into an AI processor operation 202 to extract a DJS model file 203, a dynamic jointed skeleton that captures the key features of the reference image, specifically, the expert or coach and any associated equipment involved in the motion. As exemplified in FIG. 19 , artificial intelligence is able to identify the shape of a human body 210 and identify human body parts including the head 211 a, shoulder joint 211 b, elbow 211 c, hand 211 d, hip joint 211 e, knee 211 f, ankle 211 g, and toes 211 h. Once extracted, the resulting DJS model 220 connects the identified joints with connectors representing inflexible components, such as neck bone 212 a; scapula or shoulder blade 212 b; upper-arm bone or humerus 212 c; the forearm bone 212 d; the spine and rib cage 212 e; the hip bone or pelvis 212 f; the femur or upper leg bone 212 g; the lower leg bone 212 h comprising the tibia and the fibula; and various bones collectively comprising the foot 212 i. To distinguish stick or skeleton models that include joints, we refer to the skeletal model 220 as a ‘jointed’ skeleton to highlight that joints between bones are explicitly identified.
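
The joint-and-connector structure described above can be represented as a small graph. The following is a minimal sketch rather than the disclosed implementation; the class names `Joint`, `Bone`, and `Skeleton` and the sample coordinates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class Joint:
    """A named joint at a 2D image position (arbitrary units)."""
    name: str
    x: float
    y: float


@dataclass
class Bone:
    """A rigid connector between two named joints."""
    name: str
    start: str
    end: str


@dataclass
class Skeleton:
    """Hypothetical container for one frame of a jointed skeleton."""
    joints: dict = field(default_factory=dict)
    bones: list = field(default_factory=list)

    def add_joint(self, name, x, y):
        self.joints[name] = Joint(name, x, y)

    def add_bone(self, name, start, end):
        self.bones.append(Bone(name, start, end))

    def bone_length(self, bone_name):
        """Length of a bone measured between its two joints."""
        bone = next(b for b in self.bones if b.name == bone_name)
        a, b = self.joints[bone.start], self.joints[bone.end]
        return hypot(b.x - a.x, b.y - a.y)


# Illustrative coordinates for one arm of the model.
arm = Skeleton()
arm.add_joint("shoulder", 0.0, 0.0)
arm.add_joint("elbow", 3.0, 4.0)
arm.add_joint("hand", 3.0, 7.0)
arm.add_bone("upper_arm", "shoulder", "elbow")
arm.add_bone("forearm", "elbow", "hand")
```

Measuring each bone from its end joints, as `bone_length` does, is one way the body dimensions could be read off an extracted frame.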

During the parameter extraction process these physical attributes are parameterized, i.e. the dimensions of the body parts are converted into numerical variables defining the reference athlete's body shape. The value of each parameter is then measured and a file is created for the unscaled model accurately matching the reference image. These parameters may, for example, include without limitation:

-   The variable x_(sb), describing the width of the shoulder blade 212 b,
-   The variable x_(ua), describing the length of the upper arm 212 c,
-   The variable x_(fa), describing the length of the lower arm, also known as the forearm 212 d,
-   The variable x_(t), describing the length of the body's trunk or torso as measured between the shoulder blade 212 b and the waist 212 f,
-   The variable x_(ul), describing the length of the upper leg 212 g, also referred to by its attached muscles, the quadriceps,
-   The variable x_(ll), describing the length of the lower leg 212 h, also referred to by its attached muscles, the calves.

These variables are used to define the relative size of an athlete's body parts and their overall size. By parameterizing the DJS model as described, a scalable model is created, i.e. a model that can be adjusted to match the size and shape of any athlete. Combining the scalable model with classical physics, the method is capable of predicting the impact of a change in an athlete's physical attributes from the original reference model. For example, if an athlete's legs are shorter, the model can be used to predict changes in force and launch angle of a golf ball during tee-off, and adjust the swing accordingly to produce the same result as the expert despite the fact that the golfer is taller or shorter than the expert used to create the reference model.
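
As a rough illustration of geometric scaling only (the physics-based compensation is a separate matter), the sketch below rescales a set of segment lengths either uniformly or segment by segment. The dictionary keys mirror the variables x_(sb) through x_(ll); the numeric values and the function names are hypothetical.

```python
# Hypothetical reference (expert) segment lengths in metres.
REFERENCE = {
    "x_sb": 0.40, "x_ua": 0.30, "x_fa": 0.25,
    "x_t": 0.50, "x_ul": 0.45, "x_ll": 0.40,
}


def scale_proportionally(reference, height_ratio):
    """Scale every segment by one ratio (student height / expert height)."""
    return {name: length * height_ratio for name, length in reference.items()}


def scale_per_segment(reference, student):
    """Use measured student lengths where available, else the reference."""
    return {name: student.get(name, length) for name, length in reference.items()}


def segment_ratios(reference, scaled):
    """Ratio of each scaled segment to its reference length."""
    return {name: scaled[name] / reference[name] for name in reference}


half_size = scale_proportionally(REFERENCE, 0.5)
short_legs = scale_per_segment(REFERENCE, {"x_ul": 0.36, "x_ll": 0.32})
ratios = segment_ratios(REFERENCE, short_legs)
```

Per-segment scaling preserves differences in proportion, such as shorter legs on an otherwise similar torso, that a single height ratio would miss.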

The model therefore is not simply adjusted for an athlete's size but must also be adjusted in accordance with physics to achieve the desired performance, compensating for any size changes. In essence the question is not simply “how would Tiger Woods swing the club if he were my height?” but more importantly “how would Tiger Woods have to adjust his swing to produce the same result if he were my height?” Only by simplifying an expert's movements, i.e. their dynamics, into a dynamic jointed skeleton model, can force, club acceleration and ball velocity be modeled in a succinct and rapid manner using a minimal number of calculations. A static model cannot predict force.

In mathematical vernacular, the joints of a DJS model are referred to as ‘vertices’ and the connecting bones are defined as ‘edges.’ As in any physical system subject to Newtonian mechanics, i.e., classical physics, the relative movement of edges at a vertex is subject to physical laws of motion in response to force or torque. As such, physics can be used to govern the dynamic movement of the model in time, hence the acronym DJS for ‘dynamic’ jointed skeleton. Given that the DJS is governed by physics, an extracted model can be analyzed for linear and angular position, velocity, and acceleration by analyzing the time movement of the graph edges with respect to the vertices and other edges. To extract forces in an analysis [https://en.wikipedia.org/wiki/Newton%27s_laws_of_motion], we must employ Newton's 2^(nd) Law, which states the linear vector equation F=ma for linear motion, where m is mass, a is an acceleration vector, and F is a vector force. Alternatively, for angular or rotational movement like swinging a golf club or a baseball bat, it is convenient to use the rotational version of the 2^(nd) law, τ=Iα, where τ is a torque vector, I is the moment of inertia, and α is an angular acceleration vector [https://brilliant.org/wiki/rotational-form-of-newtons-second-law/]. Given the description of body mass for the athlete derived by knowing his weight, and the mass of material and density composition of the equipment specified in an equipment specification library 206 shown in FIG. 17 , AI operation 204 employs artificial intelligence to extract vector force model parameters 205 for further analysis. The relevant force model parameters depend on the action being performed. For example, in a golf tee-off, an extracted force analysis involves the force with which the club strikes the ball. Through the use of physics such information can be used to compare one athlete's performance to another or to evaluate how the ball will travel on a given course. Since the mass of the golf club affects momentum transfer and ball launch velocity, the precise weight characteristics can be downloaded from an equipment specification library 206 in order to improve the absolute accuracy of the vector force model parameters 205 used in compact dynamic DJS model 225.
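
To make the rotational relation τ=Iα concrete, the sketch below estimates angular acceleration from three successive club angles by a central finite difference and multiplies by a moment of inertia. It is a simplified illustration under stated assumptions: the club is treated as a uniform thin rod pivoting about one end (I = mL²/3), which a real club is not, and the function names and numeric values are hypothetical.

```python
def angular_acceleration(theta_prev, theta, theta_next, dt):
    """Central-difference estimate of angular acceleration in rad/s^2.

    The three angles (radians) come from successive frames spaced dt
    seconds apart.
    """
    return (theta_next - 2.0 * theta + theta_prev) / (dt * dt)


def rod_moment_of_inertia(mass, length):
    """Uniform thin rod rotating about one end: I = m * L^2 / 3.

    Assumed simplification; a real club has a heavy head and a light shaft.
    """
    return mass * length * length / 3.0


def torque(inertia, alpha):
    """Rotational form of Newton's second law: tau = I * alpha."""
    return inertia * alpha


# Angles sampled from theta = 0.5 * 10 * t^2 at t = 0.0, 0.1, 0.2 s.
alpha = angular_acceleration(0.0, 0.05, 0.2, dt=0.1)
inertia = rod_moment_of_inertia(mass=0.3, length=1.0)
tau = torque(inertia, alpha)
```

For this synthetic input the estimated angular acceleration is about 10 rad/s², matching the constant used to generate the angles.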

In general, all motion occurs at explicitly identified joints. In the DJS depiction shown in FIG. 19 , two exceptions to the rule that motion only occurs at a joint should be mentioned. First, the unidentified virtual neck-shoulder joint 213 between neck bone 212 a and shoulder blade 212 b does allow a limited degree of rotational movement. Variables may be used to specify any body dimension including the length of the athlete's forearm x_(fa), the length of their upper arm x_(ua), the width of their shoulder blade x_(sb), the length of their torso x_(t), the length of their upper leg x_(ul), and the length of their lower leg x_(ll). A precise physical description is not needed to predict movement. For example, the spine-ribcage 212 e is depicted graphically as a triangle meeting the hipbone 212 f at a single unidentified virtual hip-spine joint 214, whereby a limited range of rotation is allowed. An alternative representation is shown in FIG. 20 where the neck-shoulder joint 213 and hip-spine joint 214 are explicitly illustrated in DJS model 221. A more accurate DJS model 222 explicitly separates spine and rib cage 212 e into two components: an upper thoracic spine and rib cage 212 m, and a lower spine or lumbar 212 n.

The addition of more vertices complicates the DJS model, slowing simulation and real-time animation. As such, care should be taken not to add any vertices unless they are needed to properly model a movement. For example, modeling the foot may or may not improve model accuracy. Overly complex models make timely calculations difficult and do not necessarily improve accuracy, as they require more variables to be used in the parameter extraction and model creation process.

Once extracted, a DJS file can be used to imitate the motion of any person as a kinematic model able to generate a video file of the motion or action, such as the DJS model for walking depicted in FIG. 21 . Although the methodology of a moving stick model was first realized by Walt Disney in 1929 [https://www.youtube.com/watch?v=oyrGwRWKtJg], the ability for intelligence to automatically analyze a picture and extract a multi-jointed skeleton model did not occur until the development of robotic vision in the 1990s, in a process using a digital thinning algorithm referred to as “skeletonization” [http://homepages.inf.ed.ac.uk/rbf/HIPR2/skeleton.htm]. The process of AI-based skeletonization, using artificial intelligence to identify components of a human, animal, or machine and extract a skeleton, is recent and still ongoing as a subject of deep learning research [https://www.youtube.com/watch?v=3ZhQKmSbNug].

The extraction of dynamic jointed skeleton (DJS) models for psychomotor learning disclosed herein is however unique, as it requires the extraction to capture physical characteristics that affect precision movement for a specific result according to the laws of physics and to preserve these subtle differences in the model. For example, creating a simple model of a person swinging a golf club is no different than animation, but modeling an athlete's action to predict performance requires physics based models. Animation, by contrast, need not follow the laws of physics. For a kinematic model for psychomotor learning to be useful, however, it must be physically accurate.

Capturing the precise movements of a tennis pro athlete, a master golf pro, or a world-class neurosurgeon requires a high resolution extraction of precise movements, stored with any associated equipment specifications involved in the action. For example, the length and weight of a golf club or of a tennis racket affects which DJS model is needed to precisely predict the desired motion. The shape of a scalpel could be the difference between a successful surgery and inadvertently severing a nerve.

And although a library of good DJS models is a key element in quality psychomotor training, it alone is not enough. It is also important to model movement by separating intervals of smooth movement and interruptions into discrete time segments identified by start and stop “triggers”.

Motion capture of a golf swing is shown in the successive images of FIG. 22A including start A, backswing B, and top of swing C corresponding to images 98 a, 98 b, and 98 c respectively. The conclusion of the backswing in image 98 c occurs at a time referred to as t=0⁻, just an instantaneous moment before t=0 where the club's velocity is zero, i.e. v=0.

As a separate movement from the backswing, the downswing commences at an instantaneous moment called t=0⁺ after the completion of the backswing at t=0, also represented by image 98 c. Following top of swing C, the downswing progresses through downswing D into drive E when the club strikes the ball, to follow-through F and ultimately to finish G, a sequence represented by images 98 c, 98 d, 98 e, 98 f, and 98 g respectively. The equivalent dynamic jointed skeletons include shoulder 250, left arm 251 a, left leg 252 a, right arm 251 b, and right leg 252 b along with club 253. As shown, the video sequence 98 a to 98 g corresponds to skeletal models 240 a through 240 g respectively.

The golf club velocity corresponding to these positions is shown in FIG. 22B, where velocity 241 a represents the commencement of the backswing 242, represented by negative velocities (v<0) and by negative time (t<0, i.e. times before t=0), corresponding to position 240 a; velocity 241 b corresponds to position 240 b; and velocity 241 c corresponds to position 240 c at time t=0, just after the completion of the backswing 242 at time t=0⁻ and just before the beginning of the downswing 243 at time t=0⁺.

At t=0, club velocity (in calculus, the first time derivative of position) changes polarity from negative to positive, and club acceleration (in calculus, the second time derivative of position or the first derivative of velocity) changes from negative (deceleration at the top of the backswing) to positive (accelerating at the commencement of the downswing). As such, either velocity or acceleration data can be extracted from successive video frames and used to identify the instant the backswing ends or the downswing commences. Downswing 243 is thereby graphically represented by positive values (v>0) including peak velocity 241 e corresponding to position 240 e and finishing at velocity 241 g when the swing follow-through is complete. As such, backswing 242 and the downswing 243 can be modeled as two smooth actions separated by a polarity reversal in direction and acceleration. This polarity transition can be used as a “trigger” beneficial in controlling model playback for the purpose of synchronization.
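
The polarity-reversal trigger can be sketched as follows: velocity is estimated from successive per-frame positions by finite differences, and the trigger is the first sample at which velocity turns positive after having been negative. This is an illustrative sketch only; the function names, the one-dimensional signed position signal, and the `eps` noise threshold are assumptions.

```python
def finite_difference_velocity(positions, dt):
    """Velocity between each pair of successive frames (dt seconds apart)."""
    return [(b - a) / dt for a, b in zip(positions, positions[1:])]


def find_polarity_trigger(velocities, eps=1e-9):
    """Index of the first positive velocity that follows a negative one.

    Samples within +/- eps of zero are treated as a pause and skipped,
    so a hold at the top of the swing delays the trigger rather than
    firing it. Returns None if no reversal is found.
    """
    seen_negative = False
    for i, v in enumerate(velocities):
        if v < -eps:
            seen_negative = True
        elif v > eps and seen_negative:
            return i
    return None


# Hypothetical signed club positions, one per frame.
positions = [0.0, -1.0, -2.0, -2.5, -2.0, -1.0, 1.0]
velocities = finite_difference_velocity(positions, dt=1.0)
trigger_index = find_polarity_trigger(velocities)
```

In this example the velocity sequence is -1, -1, -0.5, 0.5, 1, 2, so the trigger falls at index 3, the interval between the fourth and fifth frames.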

One example of the need for a triggered DJS model is to accommodate discontinuous movement. For example, some golfers stop for a moment at the top of their backswing before commencing their downswing, rather than immediately commencing the downswing as one continuous motion. This case is represented in the graph shown in FIG. 22C where downswing 243 doesn't instantly follow backswing 242, but instead is delayed by delay interval 244 of a duration Δt_(d). The delay varies dramatically among athletes, ranging from 100 ms (almost instantaneous) up to 10 seconds.

An athlete who feels comfortable waiting five seconds at the top of their swing cannot comfortably learn psychomotor skills from watching a video of an athlete who holds their club for less than a second at the top of their swing because they will feel rushed trying to catch up with the video. By partitioning the DJS model into discrete pieces of continuous movement defined by event triggers, delays and motion interruption can be matched to the student's needs. Consistent with FIG. 22B, in FIG. 22C time t=0 represents the end of the backswing 242. Therefore the backswing commences (when velocity 241 a is zero) at the time t=−t_(bs) corresponding to trigger 1 and concludes at time t=0. Downswing 243 however does not commence until trigger 2 after a variable delay Δt_(d) depicted by interval 244. Starting at velocity 245 equal to zero, downswing 243 therefore does not commence until time t=Δt_(d). Lasting a duration Δt_(ds), downswing 243 does not conclude until the time t=Δt_(d)+Δt_(ds) when velocity 241 g reaches zero. The integration of trigger 1 and trigger 2 into the DJS model therefore allows model playback to be broken into two pieces, backswing 242 and downswing 243, separated by a variable delay 244 specified by detecting a condition, e.g. a student commencing their downswing, and commencing playback by activating trigger 2. In this manner, the student won't feel rushed or pressured into trying to match the reference video of another athlete.
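
The two-trigger playback described above can be sketched as a small state machine that plays the backswing on trigger 1, holds the last backswing frame for as long as needed, and plays the downswing on trigger 2. The class name `TriggeredPlayback`, the state names, and the single-letter frame labels are hypothetical and serve only to illustrate the idea.

```python
class TriggeredPlayback:
    """Plays a model in two segments separated by a variable hold."""

    def __init__(self, backswing_frames, downswing_frames):
        self.segments = {
            "backswing": backswing_frames,
            "downswing": downswing_frames,
        }
        self.state = "idle"
        self.index = 0

    def trigger(self, number):
        """Trigger 1 starts the backswing; trigger 2 starts the downswing."""
        if number == 1 and self.state == "idle":
            self.state, self.index = "backswing", 0
        elif number == 2 and self.state == "hold":
            self.state, self.index = "downswing", 0

    def next_frame(self):
        """Return the frame to display on this display tick."""
        if self.state == "idle":
            return None
        if self.state == "hold":
            return self.segments["backswing"][-1]
        if self.state == "done":
            return self.segments["downswing"][-1]
        frames = self.segments[self.state]
        frame = frames[self.index]
        self.index += 1
        if self.index == len(frames):
            self.state = "hold" if self.state == "backswing" else "done"
        return frame


player = TriggeredPlayback(["A", "B", "C"], ["D", "E", "F", "G"])
shown = [player.next_frame()]                     # nothing before trigger 1
player.trigger(1)
shown += [player.next_frame() for _ in range(5)]  # A, B, C, then hold on C
player.trigger(2)
shown += [player.next_frame() for _ in range(5)]  # D, E, F, G, then rest on G
```

Because the hold state simply repeats the top-of-swing frame, the delay between the two segments is set by when trigger 2 arrives rather than by the reference recording.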

The same principle of trigger based discontinuous playback can be adapted to learning other psychomotor skills such as surgery, where an incision is made in two strokes rather than in one continuous movement.

As shown in the exemplary video frames and extracted skeletal models of FIG. 23 , a compact dynamic jointed skeletal model 225 minimizes the error in predicting all the respective movements in captured motion sequence 255, shown in FIG. 24 . Mathematical models to describe actual measured data of curve 255 include linear model A shown by curve 256 a, exponential model B shown by curve 256 b, polynomial model C shown by curve 256 c, and higher order polynomial D shown by curve 256 d. Constants in the mathematical model are adjusted to minimize overall errors and thereby maximize the curve fit accuracy to actual data represented by curve 255. The adjustment of model curve fitting parameters to minimize errors is referred to as parameter extraction. Parameter extraction is an imperfect process where the accuracy of a curve fit over a limited range may be increased by sacrificing accuracy over the full interval, or vice versa. Using an artificial intelligence engine to interpret a series of graphical images, however, errors can be minimized over repeated events or video sources, allowing the system to better “learn” what it is looking at.
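
Parameter extraction by error minimization can be illustrated with an ordinary least-squares polynomial fit, comparing a linear model against a second-order polynomial on the same data. This is a minimal sketch under stated assumptions: it covers only polynomial models (not the exponential model), the data are synthetic, and the helper names `polyfit` and `sum_squared_error` are hypothetical.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] (lowest first).

    Solves the normal equations by Gaussian elimination with partial
    pivoting; adequate for the small, well-conditioned example below.
    """
    n = degree + 1
    # Augmented matrix [A^T A | A^T y] for the monomial basis.
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - known) / rows[i][i]
    return coeffs


def sum_squared_error(xs, ys, coeffs):
    """Total squared residual of the fitted polynomial over the data."""
    def model(x):
        return sum(c * x ** k for k, c in enumerate(coeffs))
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))


# Synthetic measurements following y = x^2 + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 5.0, 10.0, 17.0]
linear = polyfit(xs, ys, 1)
quadratic = polyfit(xs, ys, 2)
linear_error = sum_squared_error(xs, ys, linear)
quadratic_error = sum_squared_error(xs, ys, quadratic)
```

On these data the quadratic model reproduces the measurements almost exactly, while the best straight line leaves a visibly larger residual, which is the kind of comparison used to choose among candidate models.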

As shown in FIG. 25 , once a compact dynamic DJS model 225 is extracted, its reference source video 200, i.e. source data file 260, is combined in file generation process 263 with GPS 3D terrain course map 262 and file source data information 261 to create file 264 for cloud storage. Source data information 261 may include the name of an athlete, the specific golf course, the name of the event, the date of the event, as well as the athlete's score for each hole (measured against the hole's par), and more. Terrain information 262 may also be included and may optionally include wind information by time (although this data is difficult to extract from pre-recorded files where weather information is unavailable).

As depicted in FIG. 26, playback of stored files comprises user 268 selecting a specific expert file from cloud file storage 264 via user interface 265. The download comprises compact model 225 and course map 262. UI/UX control 265 instructs DJS model visualizer 266 to process the data and model using artificial intelligence (AI) engine 267. AI processing includes 3D rotation, terrain, animation, parametrics, and calculation of performance evaluation, i.e. scoring. The generated video from the DJS model 225 may include sampled frames 240 c, 240 d, 240 e, and 240 f, and interpolated frames such as 240 z generated from the DJS model. The model can be “played” like a movie by executing an evaluation of movement on a frame-by-frame basis over time. The model can be synchronized to a live video trigger or run autonomously.
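One simple way an interpolated frame could be generated between two sampled frames is linear interpolation of joint coordinates, sketched below in Python. The function name `interpolate_pose`, the two-joint poses, and the use of linear interpolation are illustrative assumptions; the disclosure does not state which interpolation method is used, and linear interpolation of coordinates does not by itself preserve limb lengths.

```python
import numpy as np

def interpolate_pose(pose_a, pose_b, t_a, t_b, t):
    """Linearly interpolate joint coordinates between two sampled frames.

    pose_a, pose_b: arrays of shape (num_joints, dims) sampled at times t_a, t_b.
    t: time of the frame to generate, with t_a <= t <= t_b.
    """
    if t_b <= t_a:
        raise ValueError("t_b must be later than t_a")
    if not (t_a <= t <= t_b):
        raise ValueError("t must lie between t_a and t_b")
    w = (t - t_a) / (t_b - t_a)                # 0 at frame a, 1 at frame b
    a = np.asarray(pose_a, dtype=float)
    b = np.asarray(pose_b, dtype=float)
    return (1.0 - w) * a + w * b

# Hypothetical two-joint poses (x, y) from two sampled frames.
frame_a = [[0.0, 0.0], [0.0, 1.0]]
frame_b = [[2.0, 0.0], [2.0, 3.0]]
mid_frame = interpolate_pose(frame_a, frame_b, t_a=0.0, t_b=1.0, t=0.5)
print(mid_frame)   # joints at (1, 0) and (1, 2)
```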

Another feature of DJS model 225 with AI engine 267 shown in FIG. 27 includes 3D rotation of image 240 d for side view 240 x or rear view 240 y. Based on physical models, the rotation can be performed even though only a single camera is used to capture a video image. In this manner, the DJS model can always be rotated to match any available video source or even compared against multiple video sources.
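The rotation itself can be sketched as a standard rotation of joint coordinates about the vertical axis. The Python example below assumes that 3D joint coordinates have already been estimated (the step the text attributes to physical models) and uses a hypothetical axis convention of x to the right, y up, and z toward the camera; the function name `rotate_about_vertical` is likewise an assumption.

```python
import numpy as np

def rotate_about_vertical(joints, angle_deg):
    """Rotate (N, 3) joint coordinates about the vertical (y) axis."""
    a = np.radians(angle_deg)
    c, s = np.cos(a), np.sin(a)
    rotation = np.array([[c,   0.0, s],
                         [0.0, 1.0, 0.0],
                         [-s,  0.0, c]])
    # Each row is one joint; multiplying by the transpose rotates every row.
    return np.asarray(joints, dtype=float) @ rotation.T

# Hypothetical joints: a hand to the golfer's right and the head on the axis.
front_view = [[1.0, 2.0, 0.0], [0.0, 3.0, 0.0]]
rear_view = rotate_about_vertical(front_view, 180.0)   # mirrors x, keeps heights
side_view = rotate_about_vertical(front_view, 90.0)
```

Heights (the y values) are unchanged by this rotation, so a model rotated this way keeps the same stature in every view.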

In FIG. 28, AI engine 267 is able to combine DJS model 240 g with terrain 268. By combining a golf course terrain with a DJS model, a student can analyze how an expert played a particular hole. Beneficially, a student can play the hole themselves, comparing their performance against the expert.

In FIG. 29, DJS model 240 f is combined with drive animation 271 to display ball trajectory 269. The analysis of any video can categorize 270 the trajectory result 272 parametrically as a slice, hook, fade, draw, push, pull or a perfect “pure” stroke. Scoring may also be assigned to the drive and used in evaluating competitive performance.

Image overlay and Augmented Cognition: The process and apparatus of augmented cognition for psychomotor learning using a kinematic DJS model with contemporaneous feedback via A-B image DJS overlays is illustrated in FIG. 30. As shown, golfer 301 on golf course 300 wishes to learn, for example, how a pro played the same course. With smartphone 302 placed on a tripod to monitor the golfer's swing, live video file 304 is processed by artificial intelligence using intelligent image processing within AI engine 310, preferably implemented within smartphone 302, although a separate or dedicated AI processor engine may also be used. Operations within AI engine 310 occur live, i.e. in real time, involving a complex and inventive sequence of operations as follows:

-   AI-engine 310 downloads DJS model file 203 of a selected expert for training purposes.
-   AI-engine 310 receives live video streaming file 304 of golfer 301 as a continuous input.
-   On the fly, i.e. continuously, AI-engine 310 removes superfluous background content of golf course 300 from live video streaming file 304.
-   Optionally, launch monitor 316 measures parametric data from golf tee-off and provides its measurements to AI-engine 310.
-   AI-engine 310 identifies the image of golfer 301 in the video stream using artificial intelligence-based pattern recognition.
-   A sample of the video images from video streaming file 304 is analyzed to extract the height and the body proportions of golfer 301 including the golf club, lengths of upper and lower legs and arms, torso length, etc.
-   A DJS model 203 selected from the model library loaded into AI-engine 310 is adjusted to match the height and body proportions of golfer 301.
-   A set of vector force model parameters 205 (including any equipment related specifications) is loaded into AI-engine 310 and the DJS model is adjusted for the proper acceleration needed to calculate the same force and ball trajectory as the original reference library expert's performance.
-   In real time, AI-engine 310 outputs overlay 311 comprising the live image of golfer 301 and the scaled version of DJS model 203 at the same size and body proportions as golfer 301 but with motions matching the performance of the expert in the DJS reference library.
-   The DJS overlay 311 is wirelessly transmitted to visualization device 313 worn by golfer 301, allowing the athlete to compare their live motions to the DJS skeleton image overlaid upon their own live video image.
-   The image overlay 311 of the DJS model is synchronized to the motions of the golfer, triggered by the golfer's motions including, for example, the commencement of the backswing and again the commencement of the downswing. DJS model playback pauses until the golfer's next swing. AI-engine 310 dynamically changes its instruction images so that they gradually match the expert captured in the reference model more closely.
-   AI-engine 310 also outputs feedback analytics 314, which may be a report summarizing the golfer's performance or may include real time analytical data such as club angles, swing planes, etc. displayed as part of image overlay 311.

As shown in FIG. 31, AI-engine 310 optionally scores feedback analytics from launch sensor 316 data and video streaming file 304, where the calculated score 315 may be used to measure the golfer's performance, including comparing the golfer's swing to the swing of an expert. The measured data may also be used to measure the golfer's tee-off performance 322 against some evaluation criteria (e.g. angle, speed, calculated drive distance, etc.). The use of evaluation criteria provides a quantifiable measurement of an athlete's performance, including the following features:

-   Scoring 315 may be uploaded to data cloud 330 for comparing golfers 331 a and 331 b for competition, tournaments, gamification, rewards, tokenization, and gambling.
-   Combined with map details shown in FIG. 32, the launch analytics may also be used to calculate ball trajectory 332 across a course and parametrically scored 315 for the ball's final destination 334 including the distance to the hole 336, landing on or off the green 335 a or in the rough 335 b or 335 c, landing in a water or sand trap 335 d, etc.

Performance evaluation can be used on a real golf course. Alternatively, the evaluation method can be applied to a golf simulator, where the athlete practices by hitting the ball into a net but the system evaluates the tee-off performance as if the athlete were on a real golf course. In this manner, a golfer can practice by following the actions of a professional or expert depicted by the DJS model but evaluate their performance against the course or against other golfers using the simulator.

Although a smartphone or video camera combined with artificial intelligence can be used to evaluate an athlete's performance, other sensors may also be used in combination with the disclosed psychomotor learning system. For example, launch sensor 340 shown in FIG. 33 combines video image 342 with differential ultrasound or LIDAR (laser light-based radar) signals 341 a and 341 b to more precisely measure a golfer's swing than video alone can extract. Processed by AI-engine 310, the launch sensor 340 data can be used to precisely detect hand-angle 344 and club position 343, shoulder position 346, arm position 345, and waist angle 347. By analyzing a sequence of frames over time, positional data can be used to calculate swing speed and torque, including effective and applied arm torque 351 a and 351 b, effective and applied wrist torque 352 a and 352 b, and shoulder torque 350 as depicted in FIG. 34.
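The speed portion of that calculation can be sketched as a finite difference over successive frames. In the Python example below, the function name `speeds_from_positions`, the club-head coordinates, and the 100 Hz frame rate are hypothetical; torque is not modeled because it would additionally require mass and inertia data that the text does not give.

```python
import numpy as np

def speeds_from_positions(positions, frame_rate_hz):
    """Estimate speed between consecutive frames from tracked positions.

    positions: (N, 2) or (N, 3) array of one tracked point, in metres.
    frame_rate_hz: frames per second of the capture.
    Returns N - 1 speeds in metres per second.
    """
    p = np.asarray(positions, dtype=float)
    displacement = np.diff(p, axis=0)                  # movement per frame
    distance = np.linalg.norm(displacement, axis=1)    # metres per frame
    return distance * frame_rate_hz                    # metres per second

# Hypothetical club-head positions in three consecutive frames at 100 Hz.
club_head = [[0.0, 0.0], [0.3, 0.4], [0.9, 1.2]]
swing_speeds = speeds_from_positions(club_head, frame_rate_hz=100.0)
print(swing_speeds)   # approximately [50, 100] m/s
```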

Torque, acceleration, and positional data optically measured by camera, while conveniently monitored, lack accuracy. Additional accuracy can be gained by including microelectromechanical systems (MEMS) sensors in balls, clubs, and other equipment. As shown in FIG. 35, MEMS sensor 360 is able to measure a number of parameters 361 including position, velocity, acceleration, and torque versus time. The sensor relays its data to a receiver via a low power RF link 362 and internal antenna 363 using low power Bluetooth, low power WiFi (such as 802.11ah), or telephonically using 4G/LTE or low bandwidth power-saving modes of the 5G communication protocol. In FIG. 36, down-range flight trajectory 269 of golf ball 364 can be detected using MEMS sensor data 360 a including signal delay, air pressure, etc., and relayed back to a receiver using RF communication. Since golf ball 364 has a dimpled but otherwise uniform surface (spherical symmetry), torque information is not important in predicting a ball's trajectory. As such, sensor 360 a need only detect relative position and acceleration, not rotational velocities.

As shown in FIG. 37A, baseball 365, however, is not spherically symmetrical because of its stitching. Baseball sensor 360 b must measure torque in order to predict ball trajectory. This information is difficult to measure using a camera. As shown in FIG. 37B, sensors 360 c and 360 d in golf club 366 and sensors 360 e and 360 f in baseball bat 367 must detect torque to precisely describe a ball's trajectory.

As described, the AI-based system exhibits augmented cognition whereby the behavior of the golfer is trained to match the expert's performance while the AI-engine learns how best to gradually improve the golfer's performance. In the described system, the golfer can compare their actions to an expert reference using a real-time DJS overlay. As shown in FIG. 38A, a front side view combines DJS reference model 370 a with live filtered video 372 a, e.g. a silhouette image, to produce real-time DJS overlay 373 a.

The process of scaling the DJS model to the live athlete or student allows the unscaled original DJS model 370 a, having a height x_(h)(ref), to be scaled in size to fit the height x_(h)(live) of the live athlete image 372 a. The resulting composite image, i.e. overlay 373 a, thereby comprises a representative image of the live athlete 372 a at full size and a scaled version of the DJS reference model 370 c, both consistent with the height x_(h)(live) of the live athlete image 372 a. Using artificial intelligence, this scaling can be performed once at the onset of the live session or can be performed dynamically and repetitively to gradually improve the accuracy and fit of the model during each practice session.

Even without a rear-view camera, the AI system can also calculate and display the rear-view image of the golfer in real time as depicted in FIG. 38B, combining a 180° rotation of DJS reference image 370 b with a rotated image 372 b to produce real-time DJS overlay 373 b. As such, a DJS model can easily be rotated to match any image perspective of a live video feed, where video images cannot.

Aside from its advantage in image rotation, an AI-based graphics processor can execute scaling of a DJS model to match a live image or video feed of a student athlete in several ways. As shown in FIG. 38C, the unscaled DJS model 370 a can be scaled proportionally or piecemeal using graphical edges, the segments between skeletal vertices. In proportional scaling, all the DJS model elements including the arms, legs, and torso are scaled by the same proportionality factor α, whereby the height x_(w)(ref) of the model's waist from the ground is scaled to a value αx_(w)(ref) and the total height is scaled from a value x_(h)(ref) to αx_(h)(ref). The golf club length x_(c)(ref) may also be scaled proportionally or alternatively be scaled to match the actual length of the club as specified by the manufacturer's data sheet.

In a graphical edge scaled DJS model, every edge in the model is scaled separately to match the video frame of the live athlete, whereby the height x_(w)(ref) of the model's waist from the ground is scaled to a value δx_(w)(ref) and the total height is scaled from a value x_(h)(ref) to βx_(h)(ref), including separate scaling factors for the upper and lower legs, the torso, and the upper and lower arms. Even the golf club can be scaled separately from x_(c)(ref) to γx_(c)(ref).
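The difference between the two scaling approaches can be sketched in a few lines of Python. Here the proportionality factor α is taken as the ratio of live height to reference height, and each edge factor as the ratio of the live segment length to the reference segment length; these ratios, the function names, and all segment lengths are assumptions for illustration, since the text does not state how the factors are computed.

```python
def proportional_scale(ref_lengths, ref_height, live_height):
    """Scale every segment by one factor alpha = live_height / ref_height."""
    alpha = live_height / ref_height
    return {name: alpha * length for name, length in ref_lengths.items()}

def edge_scale(ref_lengths, live_lengths):
    """Scale each segment (graphical edge) by its own factor."""
    factors = {name: live_lengths[name] / ref_lengths[name]
               for name in ref_lengths}
    scaled = {name: factors[name] * ref_lengths[name] for name in ref_lengths}
    return factors, scaled

# Hypothetical reference model and live measurements, in metres.
reference = {"torso": 0.6, "upper_leg": 0.5}
live = {"torso": 0.54, "upper_leg": 0.4}

uniform = proportional_scale(reference, ref_height=2.0, live_height=1.5)
factors, per_edge = edge_scale(reference, live)
print(uniform)    # every segment multiplied by 0.75
print(factors)    # torso near 0.9, upper_leg near 0.8
```

Proportional scaling needs only a single height measurement, whereas edge scaling needs a measured length for every segment but reproduces the live athlete's individual proportions.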

Although filtered video images 372 a and 372 b are conveniently displayed as silhouettes, shadows, or glows, the filtered image can also comprise an outline, a low-resolution video, or an animated depiction of the golfer. Key advantages of this approach compared to any training aid available today include:

-   The DJS reference model is scaled in size to the golfer or athletic trainee. Reference videos cannot easily be scaled, especially when the source data comes from video archives, some dating back several decades.
-   The DJS reference model can be rotated to match the camera angle of the live image of the golfer or athletic trainee.
-   The DJS model is overlaid atop the live image of the golfer or athletic trainee so the athlete doesn't have to compare two side-by-side images, which requires the eyes to pan back and forth between the two images, thereby distracting the athlete.
-   The DJS model skeleton eliminates unnecessary detail of the reference image of the expert athlete (such as hair, hats, clothes, etc.), which can clutter the overlay and obscure details of movement.
-   The video representation eliminates unnecessary detail of the live athlete (such as hair, hats, clothes, etc.), which can clutter the video and obscure details of movement.

In one embodiment shown in FIG. 39, display of the overlay image 373 b can be conveniently realized using a standing video display 402. The live video of golfer 403 used to create the overlay can be captured using a frontside camera 400 but may optionally include a rear-view camera.

As shown in FIG. 40, to overcome bandwidth limitations of real-time communication, high-resolution video 410 from camera 400 can be recorded and stored locally in non-volatile memory 411 while AI-engine 413 filters the data stream to produce either or both of (i) low-resolution video and (ii) silhouette video 416 for limited-bandwidth transmission 419 between RF-amplifier 417 and RF receiver 421 via antennas 418 and 420 respectively. AI-engine 413 can also create a real-time DJS overlay video 373 b based on DJS model 414. Display 402 can therefore be used to display any combination of the video sources.

In particular, the DJS overlay of a live golfer's silhouette 416 and a skeletal model 430 of a reference or expert shown in FIG. 41 allows an athlete to immediately see what they are doing wrong and how their motion differs from a master athlete, e.g. the difference 433 between the student's club position 431 and the expert's club position 432. The value of contemporaneous feedback is that the student or trainee can improve their swing or stroke with each attempt. Unlike a looping video, the expert only swings when the student swings their club. The movement is natural, where the expert's swing pauses at the top at t=0⁻ and doesn't resume until it is triggered by the beginning of the student's downswing at t=0⁺. At the beginning of training, the AI-engine accommodates the student's slow pace, learning what is comfortable to the student. Gradually, however, the AI-engine incrementally accelerates the pace until the student matches the expert's timing and performance and the skill is learned.
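The incremental acceleration of pace can be pictured as a playback-speed factor that starts below the expert's speed and rises toward it. The Python sketch below uses a fixed linear ramp purely as a stand-in: the function name `playback_rates`, the starting rate, and the step size are hypothetical, and the disclosure describes the AI-engine learning the appropriate pace rather than following a preset schedule.

```python
def playback_rates(initial_rate, step, sessions):
    """Playback-speed factor per session, capped at 1.0 (the expert's pace).

    initial_rate: starting fraction of the expert's speed, e.g. 0.5.
    step: amount added to the rate after each session.
    sessions: number of practice sessions to schedule.
    """
    rates = []
    rate = initial_rate
    for _ in range(sessions):
        rates.append(min(rate, 1.0))   # never play faster than the expert
        rate += step
    return rates

# Hypothetical schedule: begin at half speed and add a quarter each session.
print(playback_rates(0.5, 0.25, 4))   # [0.5, 0.75, 1.0, 1.0]
```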

As shown in FIG. 42, the DJS overlay image 444 can be displayed 443 in a heads-up display 440 using a projector 441 or other heads-up or holographic display techniques.

The benefit of contemporaneous feedback in psychomotor learning using augmented cognition is applicable to a wide range of activities including sports such as diving, skating, skiing, golf, tennis, basketball, hockey, weight lifting, archery, and baseball, as well as precision professional skills such as automotive repair, surgery, sign language, and defense related activities such as marksmanship, martial arts, etc.

1. A method to teach psychomotor skills to a live athlete or student comprising a camera, an image processor, a dynamic jointed skeleton reference model and a display device visible by the live athlete during practice, whereby: the live athlete's movements are captured by a camera in real time as a succession of video frames and filtered to remove superfluous detail; the image processor analyzes the live athlete's relevant physical attributes from the captured video frame images then scales the dimensions of the dynamic jointed skeleton to best match the live athlete's body dimensions; the scaled dynamic jointed skeletal model generates images of a jointed skeleton as a motion sequence; the generated images of the jointed skeleton model are dynamically overlaid onto the live athlete's image to create a composite video image containing both live and generated image content; and where the composite image is delivered to a video display unit for the live athlete to observe thereby delivering a real-time visual comparison of the athlete's position and movements to that of the skeleton.